To obtain a landscape of gross genetic alterations in small cell lung cancer (SCLC), genome-wide copy number analysis and
whole-transcriptome sequencing were performed in 58 and 42 SCLCs, respectively. Focal amplification of known oncogene
loci, MYCL1 (1p34.2), MYCN (2p24.3), and MYC (8q24.21), was frequently and mutually exclusively detected. MYCL1 and
MYC were co-amplified with other regions on either the same or a different chromosome in several cases. In addition,
the 9p24.1 region was identified as being amplified in SCLCs without amplification of MYC family oncogenes. Notably,
expression of the KIAA1432 gene in this region was significantly higher in KIAA1432-amplified cells than in non-amplified
cells, and its mRNA expression showed strong correlations with the copy numbers. Thus, KIAA1432 is a novel gene
activated by amplification in SCLCs. By whole-transcriptome sequencing, a total of 60 fusion transcripts, transcribed from 95
different genes, were identified as being expressed in SCLC cells. However, no in-frame fusion transcripts were recurrently
detected in ≥2 SCLCs, and genes in the amplified regions, such as PVT1 neighboring MYC and RLF in MYCL1 amplicons,
were recurrently fused with genes in the same amplicons or with those in different amplicons on either the same or a
different chromosome. Thus, it was indicated that amplification and fusion of several genes on chromosomes 1 and 8 occur
simultaneously but not sequentially through chromothripsis in the development of SCLC, and amplification rather than
fusion of genes plays an important role in its development. © 2013 Wiley Periodicals, Inc.



INTRODUCTION 

Lung cancer is the leading cause of cancer death 
worldwide, and accounts for 18% of total cancer 
deaths in a year (Jemal et al., 2011). In particular,
most small cell lung cancer (SCLC) cases are
diagnosed after metastatic spread of the disease,
and only 5% of SCLC patients survive beyond 5
years after diagnosis (Worden and Kalemkerian,
2000; Cooper and Spiro, 2006). Therefore, to
improve patient outcomes in this disease,
it is necessary to identify druggable targets that are
activated by genetic alterations in SCLC cells.
However, since only a limited fraction of SCLC
cases are treated by surgery and most are
treated by chemotherapy and/or radiotherapy,
tumor tissues are rarely available for molecular
analysis. For this reason, only a few activating genetic
alterations have been identified to date in SCLC
cells, including amplification of the MYC family
oncogenes, MYCL1 (1p34), MYCN (2p24), and MYC
(8q24) (Wistuba et al., 2001). Recently, whole-genome
profiling has been applied to obtain further
information about copy number alterations, point
mutations, and fusions in SCLCs (Kim et al., 2006;
Campbell et al., 2008; Pleasance et al., 2010;
Voortman et al., 2010; Dooley et al., 2011; Peifer
et al., 2012; Rudin et al., 2012). The results indicated
that copy number gains occur in various chromosomal
regions, including the JAK2 (9p24), FGFR1
(8p12), TNFRSF4 (1p36), DAD1 (14q11), BCL2L1
(20q11), BCL2L2 (14q11), FAK (8q24), NFIB
(9p23), and SOX2 (3q26) genes, in SCLCs. As for
gene fusions, the PVT1 gene, which is immediately
downstream of the MYC gene at 8q24, and the CHD7
gene at 8q12, both with copy number alterations, were
found to be fused in the H2171 and Lu135 cell lines
(Campbell et al., 2008; Pleasance et al., 2010).
However, since most genetic studies in SCLC have been
done using cultured cell lines, genetic alterations
accumulated in fresh SCLCs in vivo are still unclear.

In this study, to obtain a landscape of gross
genetic alterations in both fresh tumors and cell
lines, genome-wide copy number analysis was
performed for 33 fresh tumors and 25 cell lines to
identify genes amplified in SCLCs. In parallel,
whole-transcriptome sequencing was performed
for 19 fresh tumors and 23 cell lines to identify
fusion genes expressed in SCLCs. By copy number
analysis, a novel chromosomal region amplified
in a mutually exclusive manner with MYC family
genes was identified, and genes in this region whose
overexpression accompanied gene amplification
were further identified. By combining the results of
copy number analysis with those of whole-transcriptome
sequencing, it was further revealed that
fusion transcripts were often expressed from genes
in several amplified regions, suggesting that
amplification and fusion of genes occur simultaneously
but not independently by chromothripsis in the
development of SCLC.

MATERIALS AND METHODS 

Patients and Tissues 

Sixty-two tumors and corresponding non-cancer- 
ous tissues were obtained at surgery or autopsy 
from 1985 to 2010 at the National Cancer Center 
Hospital, Tokyo, Saitama Medical University, Sai- 
tama, and University of Tsukuba, Ibaraki, Japan 
(Supporting Information Table S1A). Genomic 
DNA was extracted with a QIAamp DNA mini kit 
(Qiagen, Hilden, Germany). Total RNA was 
extracted using TRIzol reagent (Invitrogen, 
Carlsbad, CA), purified with an RNeasy kit (Qiagen),
and reverse-transcribed to cDNA using
the SuperScript III First-Strand Synthesis System
(Invitrogen) with random hexamers according to
the manufacturer's instructions. This study was
performed under the approval of the Institutional
Review Board of the National Cancer Center.

Cell Lines 

Twenty-five cell lines were used in this study
(Supporting Information Table S1B). HCC33,
N417, H69, H82, H1607, H1963, H2107, H2141,
and H2171 were obtained from Dr. J. D. Minna
(University of Texas Southwestern, Dallas), H526
and H841 from Dr. C. C. Harris (NCI, Bethesda),
Ms18 from Dr. E. Shimizu (Tottori University,
Tottori, Japan), and the Lu-series from Dr. T. Terasaki
(National Cancer Center, Tokyo, Japan).
Other cell lines were obtained from the American
Type Culture Collection or the Japanese Collection
of Research Bioresources. Genomic DNA was
extracted as described previously (Iwakawa et al.,
2011). Poly-A(+) RNA was extracted with a
FastTrack mRNA isolation kit (Invitrogen) and
reverse-transcribed to cDNA as described above.

Genome-wide Copy Number Analysis 

Copy number analysis was performed using 
SNP-Chips for human 250K Nsp SNP arrays 
(Affymetrix, Inc., Santa Clara, CA). Methods used 
for the analysis were previously described (Naka- 
nishi et al., 2009; Iwakawa et al., 2011). Copy 
numbers were determined using the Copy Num- 
ber Analyzer for Affymetrix GeneChip Mapping 
Array (CNAG) software (Nannya et al., 2005; 
Yamamoto et al., 2007). 

Whole-transcriptome Sequencing 

cDNA libraries for RNA sequencing were prepared
using the mRNA-Seq Sample Prep Kit (Illumina,
San Diego, CA) according to the
manufacturer's protocol. Briefly, poly-A(+) RNA
purified from 4 µg of total RNA extracted from
tumors or 0.1 µg of poly-A(+) RNA extracted from
cell lines was fragmented in a fragmentation buffer
and used for double-stranded cDNA synthesis.
After ligation of the paired-end (PE) adapter, a
fraction of 300-350 bp was gel-purified and amplified
by PCR. The resulting libraries were subjected
to PE sequencing of 50-bp reads on the
Genome Analyzer IIx (GAIIx) (Illumina).

Detection of Fusion Transcripts 

PE reads derived from fusion transcripts were
searched for as recently described (Kohno et al.,
2012). Briefly, PE reads were mapped on human
reference RNA sequences deposited in the
RefSeq database using the BOWTIE program
(version 0.12.5), and PE reads in which the two
reads were mapped on different RNA sequences
were assembled into "clusters". Paired-clusters
consisting of >10 PE reads in each sample, for
which PE reads did not appear in any of three
non-cancerous lung tissues, were picked up.
Paired-clusters mapped within a gene region or a
neighboring-gene region (<100 kb in the genome
and the same strand) were removed due to the
possibility of alternative splicing and read-through
transcription. Junction reads encompassing
the fusion boundaries were searched using
the MapSplice (version 1.14.1) software with
modifications. Transcripts that were supported by
>10 PE reads and >10 junction reads were
defined as gene fusions.
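
The filtering steps above can be sketched in a few lines of Python. This is only an illustration of the stated criteria, not the authors' pipeline: the `PairedCluster` record, its field names, and the helper functions are hypothetical, and the thresholds are read literally from the text as strict inequalities (the printed symbols may originally have been inclusive).

```python
from dataclasses import dataclass

# Thresholds read literally from the text; strict vs. inclusive is an assumption.
MIN_PE_READS = 10
MIN_JUNCTION_READS = 10
NEIGHBOR_DISTANCE_BP = 100_000


@dataclass
class PairedCluster:
    """Hypothetical record for one paired-cluster (not the authors' data model)."""
    gene5: str
    gene3: str
    pe_reads: int          # PE reads supporting the pair in the SCLC sample
    junction_reads: int    # reads spanning the fusion boundary
    normal_pe_reads: int   # supporting PE reads summed over the normal lung samples
    same_chrom: bool
    same_strand: bool
    distance_bp: int       # genomic distance between partners; used only if same_chrom


def is_read_through_candidate(c: PairedCluster) -> bool:
    """Same gene, or nearby same-strand neighbors: possible splicing or read-through."""
    if c.gene5 == c.gene3:
        return True
    return c.same_chrom and c.same_strand and c.distance_bp < NEIGHBOR_DISTANCE_BP


def call_fusions(clusters):
    """Return the paired-clusters that pass every filter described in the text."""
    calls = []
    for c in clusters:
        if c.pe_reads <= MIN_PE_READS:
            continue  # too few supporting PE reads
        if c.normal_pe_reads > 0:
            continue  # also seen in non-cancerous lung
        if is_read_through_candidate(c):
            continue  # likely alternative splicing or read-through
        if c.junction_reads <= MIN_JUNCTION_READS:
            continue  # fusion boundary not sufficiently supported
        calls.append(c)
    return calls
```

Keeping each criterion as a separate early exit makes it easy to relax one threshold (for example the junction-read cutoff) without touching the others.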

Reverse Transcription (RT)-PCR and Sanger 
Sequencing 

cDNA was amplified by PCR using KAPA Taq
DNA Polymerase (KAPA Biosystems, Woburn,
MA). PCR products were directly sequenced in
both directions using the BigDye Terminator kit
and an ABI 3130xl DNA Sequencer (Applied
Biosystems, Foster City, CA).

Real-time RT-/Genomic-PCR 

The amount of mRNA was quantified using
TaqMan Gene Expression Assays (Applied
Biosystems). Gene copy numbers were determined
by TaqMan Copy Number Assays
(Applied Biosystems). Primers are listed in
Supporting Information Table S1C. HPRT1 and
RPPH1 were used as references for mRNA and
copy number analyses, respectively. Real-time
PCR was performed using the ABI 7900HT
real-time PCR system (Applied Biosystems). Data
were analyzed by ABI RQ Manager v1.2 for
mRNA analysis and ABI Prism 7900HT
Sequence Detection Software v2.3 for copy number
analysis.

Microarray Experiments and Data Processing 

Two micrograms of total RNA were labeled 
using a 5X MEGAscript T7 kit (Ambion, Inc., 
Austin, Texas) and analyzed by U133Plus2.0 
arrays (Affymetrix), and data was processed by the 
MAS5 algorithm as described previously 
(Okayama et al., 2012). 



RESULTS 

Amplified Genes Identified by Genome-wide Copy 
Number Analysis 

A total of 58 SCLCs, consisting of 33 fresh
tumors and 25 cell lines (Supporting Information
Table S1A, B), were subjected to 250K SNP array
analysis, and all genomic regions with >5 copies
in >5 consecutive SNP loci were first picked up
as the amplified regions in the SCLC genomes.
However, by these criteria, whole chromosomes or
whole chromosomal arms were more frequently
picked up than focal chromosomal regions in various
chromosomes among various tumors and cell
lines. Therefore, amplified chromosomal regions
defined as segments of >5 consecutive SNP loci
with estimated copy numbers of >6 were next
picked up from each SCLC. Ten amplified
regions were identified on chromosomes 1p, 8q,
9p, 12p, and 19p in 7 of the 33 fresh tumors
(Supporting Information Table S2). Sizes of amplified
regions ranged from 0.05 to 3.61 Mb (mean ± SD
= 1.06 ± 1.25 Mb), and 110 genes were mapped
in these regions. Forty-seven amplified regions
were identified on chromosomes 1p, 2p, 8q, 9p,
12p, 14q, 17q, and 20q in 13 of the 25 cell lines
(Supporting Information Table S2). Sizes of amplified
regions ranged from 0.08 to 4.22 Mb (mean ±
SD = 0.67 ± 0.81 Mb), and 211 genes were
mapped in these regions. Therefore, various
chromosomal regions were identified as being focally
amplified by the criterion of copy numbers >6, sizes
of amplified regions were similar in fresh tumors
and cell lines, and several amplified regions in
fresh tumors overlapped with those in cell lines
(Supporting Information Table S2). Accordingly,
commonly amplified regions were determined by
comparison of amplified regions among all 58
SCLCs, including both fresh tumors and cell lines.
Eight regions on chromosomes 1p, 2p, 8q, and 9p
were commonly (≥2 SCLCs) amplified in these
SCLCs (Table 1). Sizes of the regions ranged from
0.03 to 0.77 Mb (mean ± SD = 0.25 ± 0.23 Mb),
and 34 genes were mapped in these regions.
Three of the 8 regions contained MYC family
oncogenes, MYCL1, MYCN, and MYC,
respectively, known to be frequently amplified in
SCLCs (Wistuba et al., 2001; Kim et al., 2006;
Voortman et al., 2010; Larsen and Minna, 2011).
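
As a rough illustration of the second, stricter criterion, the sketch below scans per-SNP copy number estimates along one chromosome and reports runs that qualify as amplified segments. It is not the CNAG implementation: the function name, the list-based input, and the literal reading of ">5 loci" and ">6 copies" as strict inequalities are all assumptions.

```python
def amplified_segments(copy_numbers, min_copies=6, min_snps=5):
    """Return (start, end) index pairs of runs of consecutive SNP loci.

    A run qualifies when every locus has an estimated copy number above
    `min_copies` and the run contains more than `min_snps` loci.
    `copy_numbers` is assumed to be ordered by genomic position.
    """
    segments = []
    run_start = None
    for i, cn in enumerate(copy_numbers):
        if cn > min_copies:
            if run_start is None:
                run_start = i  # a new candidate run begins here
        else:
            if run_start is not None and i - run_start > min_snps:
                segments.append((run_start, i - 1))
            run_start = None
    # Close a run that extends to the last locus.
    if run_start is not None and len(copy_numbers) - run_start > min_snps:
        segments.append((run_start, len(copy_numbers) - 1))
    return segments
```

Under this reading, a gain covered by only two SNP markers would not be reported as an amplified segment, however high its copy number.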

MYCL1 and MYCN were co-amplified with
TRIT1 at 1p34.2 and MYCNOS at 2p24.3,
respectively, in 6 and 2 SCLCs. In the 8q24.21 region,
MYC, MIR1204, and PVT1 were co-amplified in 6
SCLCs. Copy number breakpoints of amplified



Genes, Chromosomes & Cancer DOI 10.1002/gcc 



AMPLIFICATION AND FUSION IN SMALL CELL LUNG CANCER 80S 



TABLE 1. Chromosomal Regions and Genes Commonly Amplified in Small Cell Lung Cancers
Candidate target genes are shown in bold.



regions at 8q24.21 were mapped in the PVT1 gene
in five of the six SCLCs (Supporting Information
Table S2; Supporting Information Fig. S1). The
1p34.3 region was co-amplified with MYCL1 at
1p34.2, and the 8q12.2 regions were co-amplified
with MYC at 8q24.21, respectively, in several
SCLCs (Supporting Information Fig. S2). Therefore,
the occurrence of complicated intrachromosomal
rearrangements was suggested in the process of
MYCL1 and MYC amplification, resulting in the
co-amplification of several other genes on
chromosomes 1 and 8, respectively.

In addition to the regions on chromosomes 1, 2,
and 8, two novel commonly amplified regions
were identified on chromosome 9p, at 9p23 and
9p24.1 (Table 1; Fig. 1). The 9p23 region including
FLJ41200 was amplified in 2 SCLCs, whereas
the 9p24.1 region, including PLGRKT, CD274,
PDCD1LG2, KIAA1432, and ERMP1, was amplified
in three SCLCs. Both regions were co-amplified
in the H1607 cell line, whereas only the 9p23
region was amplified in the H446 cell line and
only the 9p24.1 region was amplified in two fresh
tumors (Supporting Information Fig. S2). On
chromosome 9p, NFIB at 9p23 and JAK2 at 9p24.1
were reported to be amplified in SCLC (Voortman
et al., 2010; Dooley et al., 2011). However, NFIB
and JAK2 were each amplified in only one SCLC.
Therefore, these two genes were not
mapped in the commonly amplified regions on
chromosome 9p (Supporting Information Table
S2; Table 1; Fig. 1).

Expression of Amplified Genes 
on Chromosome 9p 

Five genes, PLGRKT, CD274, PDCD1LG2,
KIAA1432, and ERMP1, were mapped in the
commonly amplified region at 9p24.1 (Table 1). To
determine which genes were overexpressed by
gene amplification, their mRNA expression was
profiled in 19 fresh tumors (Supporting Information
Fig. S3A). These five genes were amplified in
one of the 19 tumors, SM09-010T. Expression of
PLGRKT, CD274, and KIAA1432, but not of
PDCD1LG2 and ERMP1, was distinctly high in
SM09-010T (Supporting Information Fig. S3B).
Therefore, PLGRKT, CD274, and KIAA1432 could
be overexpressed by gene amplification in SCLCs.

Figure 1. Copy number plots of commonly amplified regions on chromosome 9p in four
SCLCs. Copy numbers determined by 250K SNP array analysis are indicated by colored bars.
Genes mapped in the two commonly amplified regions were aligned according to the BLAST human
sequences (Build 37.3) in the NCBI database (http://www.ncbi.nlm.nih.gov/).

To further determine genes whose expression is
associated with copy numbers on chromosome 9p,
we next performed an association study of gene
expression with gene copy number in 55 SCLCs,
including 30 fresh tumors and 25 cell lines
(Supporting Information Table S1A, B). In addition to
PLGRKT, CD274, and KIAA1432 at 9p24.1 and
FLJ41200 at 9p23 as genes commonly amplified in
SCLCs, NFIB at 9p23 and JAK2 at 9p24.1 were
also subjected to the analysis. Expression of
CD274 and KIAA1432 in amplified cells was
significantly higher than that in non-amplified cells
(P < 0.05) (Fig. 2A), and the five genes other than
FLJ41200 showed significant associations between the
levels of mRNA expression and copy numbers
(P < 0.05) (Fig. 2B). Notably, KIAA1432 showed the
strongest association between them (P = 1.04E-06).
Therefore, KIAA1432 is the strongest candidate
target activated by gene amplification on chromosome 9p
in SCLC. If there is another target in the 9p23
region, NFIB is more likely to be the one than
FLJ41200.
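
The expression values compared here are described as ΔCt values normalized to HPRT1 and calibrated against normal lung RNA. A minimal sketch of that calculation, and of a Pearson correlation against copy number ratios, is given below. It assumes the standard comparative-Ct relationship (one cycle equals a twofold change, i.e. ideal amplification efficiency); the function names are hypothetical, and the text does not state which correlation statistic was used.

```python
import math


def log2_relative_expression(ct_target, ct_reference,
                             cal_ct_target, cal_ct_reference):
    """log2 expression of a target gene relative to a calibrator sample.

    delta-Ct = Ct(target) - Ct(reference gene); the log2 fold change is the
    negative of (sample delta-Ct - calibrator delta-Ct). Assumes that each
    PCR cycle doubles the product.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = cal_ct_target - cal_ct_reference
    return -(delta_ct_sample - delta_ct_calibrator)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

A lower Ct means more template, which is why the sign is flipped when converting ΔΔCt to a log2 expression level.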



Copy Numbers of Amplified Genes Defined by 
Real-time Genomic-PCR 

To further investigate the prevalence and specificity
of gene amplification in the chromosome 1,
2, 8, and 9 regions in SCLCs, 87 SCLCs comprising
62 tumors and 25 cell lines were subjected
to real-time genomic-PCR analyses. Among
them, 33 tumors and 25 cell lines were also subjected
to 250K SNP array analyses (Supporting
Information Table S1A, B). Three MYC family
genes and six genes on chromosome 9p were analyzed,
and the criterion for gene amplification by
real-time genomic-PCR was defined as a DNA copy
number ratio >3, which is equivalent to a copy
number >6. Five of the 33 tumors and 12 of the
25 cell lines showed amplification of 1-5 of the 9
genes by 250K SNP array analysis (Supporting
Information Table S2). Except for two SCLCs,
the H1184 cell line with MYCL1 amplification and
the SM09-011T1 tumor with MYC amplification,
amplification of genes defined by 250K SNP arrays
was consistently detected by real-time genomic-PCR
(Supporting Information Table S3). On the
other hand, in seven SCLCs without amplification
by 250K SNP array analyses, 1-6 genes were
judged as being amplified by real-time genomic-PCR
analyses. Inconsistencies of the results were
due to the following reasons. Firstly, only a small
region including MYC covered by two SNP
markers was amplified in the Lu135 cell line;
therefore, this region was not defined as being
amplified by SNP array analysis, but was defined as
being amplified by real-time genomic-PCR analysis.
Secondly, in R-511T and SM09-006T, copy
number loss of the reference locus, RPPH1, on
chromosome 14 enhanced the degree of amplification
in several genes by real-time genomic-PCR
analysis. Thirdly, in H2195 and R-506M1, it was
difficult to define the copy numbers, possibly due
to the heterogeneity of aneuploid cells and the
presence of contaminating non-cancerous cells,
respectively. Therefore, in these SCLCs, genes
with copy number ratios >3 by real-time genomic-PCR
were defined as having five copies by SNP array
analysis.

Figure 2. Association of copy numbers with expression levels in
the 9p23-24 genes in SCLC cells. Levels of mRNA expression were
quantified as ΔCt values using the HPRT1 gene as a control. Levels
relative to normal lung were calculated using Human Lung Poly-A(+)
RNA (Clontech) as the calibrator. (A) Levels of mRNA expression
(log2) quantified by real-time RT-PCR in amplified (+) and not
amplified (-) SCLCs. P-values by Student's t-test for differences are shown.
(B) Correlation of copy number ratios by real-time genomic-PCR with
mRNA expression levels by real-time RT-PCR among 55 SCLCs.
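
The stated equivalence between a copy number ratio >3 and a copy number >6, and the way loss of the reference locus inflates the ratio, can be checked with a small calculation. The sketch below assumes the ratio is the target-to-reference signal normalized to a diploid control with two copies of each locus; the function names are hypothetical.

```python
def observed_ratio(target_copies, reference_copies,
                   normal_target=2, normal_reference=2):
    """Copy number ratio of a target gene relative to a reference locus.

    The ratio is normalized to a diploid control, so a tumor with 6 copies
    of the target and 2 copies of the reference gives a ratio of 3.
    """
    tumor = target_copies / reference_copies
    normal = normal_target / normal_reference
    return tumor / normal


def is_amplified(ratio, threshold=3):
    """Amplification call by the ratio criterion, read as a strict inequality."""
    return ratio > threshold
```

For example, a target present at 4 copies gives a ratio of 2 when the reference locus is intact, but a ratio of 4 when one reference copy is lost, which would cross the threshold without true high-level amplification.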

Even with some inconsistencies between the
results of SNP array analyses and those of
real-time genomic-PCR analyses, the association
between them was highly significant (P = 5.58E-07
by Fisher's exact test). Therefore, we then
investigated the prevalence and specificity of gene
amplification based on the data obtained by
real-time genomic-PCR analysis. Amplification of
these genes was detected in 12 of the 62 fresh
tumors (19.4%) and 13 of the 25 cell lines (52.0%)
(Table 2). Three MYC family genes were amplified
in a mutually exclusive manner in 16 SCLCs
(18.4%). Genes on chromosome 9p were amplified
in 10 SCLCs (11.5%). Notably, NFIB/FLJ41200
at 9p23 and JAK2/CD274/PLGRKT/KIAA1432 at
9p24.1 were not co-amplified in 8 of the 10
SCLCs, indicating that amplification of these two
regions occurred independently in most SCLCs.
Therefore, the presence of target genes for
amplification in each region was highly suggested. For
this reason, specificities of 9p23 amplification and
9p24.1 amplification were independently analyzed
among the 87 SCLCs. Importantly, four
genes in the 9p24.1 region were amplified in a
mutually exclusive manner with MYC family
genes, whereas the MYC gene was co-amplified in
one of three SCLCs with NFIB amplification,
consistent with the results of 250K SNP array analyses
(Supporting Information Fig. S2).

TABLE 2. Occurrence of Gene Amplification in a Mutually Exclusive Manner in Small Cell Lung Cancers

                        1p34.2    2p24.3    8q24.21   9p23      9p23      9p24.1    9p24.1    9p24.1    9p24.1
Sample name             MYCL1     MYCN      MYC       NFIB      FLJ41200  JAK2      PLGRKT    CD274     KIAA1432
H1963                   +         -         -         -         -         -         -         -         -
HCC33                   +         -         -         -         -         -         -         -         -
H510                    +         -         -         -         -         -         -         -         -
H2141                   +         -         -         -         -         -         -         -         -
SM09-012T               +         -         -         -         -         -         -         -         -
S391T                   +         -         -         -         -         -         -         -         -
H69                     -         +         -         -         -         -         -         -         -
H526                    -         +         -         -         -         -         -         -         -
1591T                   -         +         -         -         -         -         -         -         -
R-511T                  -         +         -         -         -         -         -         -         -
H82                     -         -         +         -         -         -         -         -         -
N417                    -         -         +         -         -         -         -         -         -
Lu135                   -         -         +         -         -         -         -         -         -
H2171                   -         -         +         -         -         -         -         -         -
SM09-019T               -         -         +         -         -         -         -         -         -
H446                    -         -         +         +         +         -         -         -         -
H2195                   -         -         -         +         +         -         -         -         -
SM09-008T               -         -         -         -         +         -         -         -         -
SM09-006T               -         -         -         +         +         +         ■         ■         +
H1607                   -         -         -         -         +         +         +         +         +
R-506M1                 -         -         -         -         -         +         +         +         +
SM09-010T               -         -         -         -         -         -         +         +         +
R-513T                  -         -         -         -         -         -         +         +         +
SM09-004T               -         -         -         -         -         -         +         +         -
1491M                   -         -         -         -         -         -         -         -         +
Amplification rate (%)  6.9       4.6       6.9       3.4       5.7       3.4       6.9       6.9       6.9
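
The MYC family figures quoted above can be re-derived from the sample lists in Table 2. The sketch below is only a consistency check: the sample names are transcribed from the table as read here, the denominator of 87 SCLCs comes from the text, and the helper names are hypothetical.

```python
TOTAL_SCLCS = 87  # SCLCs analyzed by real-time genomic-PCR, per the text

# Samples marked "+" in the three MYC family columns of Table 2.
MYC_FAMILY_CALLS = {
    "MYCL1": ["H1963", "HCC33", "H510", "H2141", "SM09-012T", "S391T"],
    "MYCN": ["H69", "H526", "1591T", "R-511T"],
    "MYC": ["H82", "N417", "Lu135", "H2171", "SM09-019T", "H446"],
}


def amplification_rate(samples, total=TOTAL_SCLCS):
    """Percentage of analyzed SCLCs carrying the amplification, to one decimal."""
    return round(100 * len(samples) / total, 1)


def mutually_exclusive(calls):
    """True when no sample appears under more than one gene."""
    seen = set()
    for samples in calls.values():
        for sample in samples:
            if sample in seen:
                return False
            seen.add(sample)
    return True


def total_amplified(calls):
    """Number of distinct samples with amplification of any listed gene."""
    return len({s for samples in calls.values() for s in samples})
```

With these lists, the per-gene rates match the bottom row of the table, and the 16 distinct samples correspond to the 18.4% reported for the MYC family as a whole.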

Identification of Fusion Transcripts by Whole- 
transcriptome Sequencing 

Forty-two SCLCs, consisting of 19 fresh tumors
and 23 cell lines, were subjected to whole-transcriptome
sequencing to identify fusion genes
expressed in SCLCs (Supporting Information
Tables S1A, B and S4). Total read counts ranged
from 74,378,482 to 93,612,490, and their average
was 86,529,758. A total of 60 fusion transcripts,
transcribed from a portion of 95 genes, were identified
as being expressed by the criterion of >10
paired-end (PE) reads for each transcript (Table
3). Twenty-two of them (36.7%) were in-frame
and thus were predicted to produce fusion proteins.
There was no set of 5'-3' fusion transcripts
recurrently detected in ≥2 SCLCs. However, two of
the 5' partner genes, PVT1 and RLF, were
detected recurrently in ≥2 SCLCs (14 pairs in 7
SCLCs). PVT1 was detected as the 5' partner gene
of seven fusion pairs with different 3' partner
genes in five SCLCs. Previously, PVT1 was shown
to be fused with CHD7 in H2171 and Lu135
(Campbell et al., 2008; Pleasance et al., 2010). In
this study, PVT1-CHD7 was detected in H2171
with >10 PE reads but was not detected in Lu135. RLF
was detected as the 5' partner gene of 7 fusion
pairs with different 3' partner genes in 2 SCLCs.
RLF was reported as being a fusion gene with
MYCL1 expressed in SCLC cells (Makela et al.,
1991a, 1991b, 1995), and the RLF-MYCL1 fusion
was also detected in H1963 with >10 PE reads in
this study. Three of the five fusion pairs having
RLF as the 5' partner gene, RLF-MYCL1,
RLF-SMAP2, and RLF-FAM132A, were predicted
to produce fusion proteins. The remaining 46
fusion transcripts were each detected in a single
SCLC, and 19 of them were predicted to
produce fusion proteins. Therefore, none of the 22
fusion transcripts predicted to produce fusion
proteins was expressed recurrently in multiple SCLC
cases.

Amplification of Genes with Fusions 

We next investigated the copy numbers of 95 
genes in the 60 fusion pairs identified by whole- 
transcriptome sequencing (Table 3). Twenty-eight 
of the 5' partner genes detected in 9 SCLCs and 
22 of the 3' partner genes detected in seven 
SCLCs were mapped in the amplified regions. 

In four of the five SCLCs expressing fusion
transcripts with PVT1 as the 5' partner gene, the 5'
portions of the PVT1 gene at 8q24.21 were amplified,
indicating that chromosomal breaks had
occurred in the PVT1 locus (Supporting Information
Fig. S1). Three of seven genes fused with
PVT1 were also amplified. Therefore, it was indicated
that chromosomal breaks often occur in the
PVT1 locus during the process of MYC amplification.
For this reason, we further searched for PVT1
fusion transcripts expressed in the H2171, Lu135,
and N417 cell lines, which showed co-amplification
of three regions on chromosome 8q (Table 1).
In addition to PVT1-CHD7 (PE reads = 345),
PVT1-SLC7A7 (PE reads = 219) and PVT1-CCNB1IP1
(PE reads = 114) were detected in
H2171. As described above, no PVT1 fusion was
detected in Lu135. PVT1-CLVS1 (PE reads = 34)
and PVT1-ASPH (PE reads = 27, junction reads =
9) were detected in N417.

RLF, detected as the 5' partner gene of seven
fusion pairs in two cell lines, H1963 and HCC33,
was amplified in both cell lines (Table 3). Three
3' partner genes, MYCL1, COL9A2, and SMAP2,
fused with RLF in H1963 were also mapped to
1p34.2 and amplified. Two other 3' partner genes,
BCL2L1 and HM13, fused with RLF in H1963 and
mapped to 20q11.21, were also amplified. The
remaining two 3' partner genes, UBE2J2 and
FAM132A at 1p36.33, were also amplified in
HCC33 in 2 consecutive SNPs. Therefore,
production of fusion transcripts with RLF was always
accompanied by amplification of both the 5' and 3'
genes, indicating that those genes had fused in the
process of MYCL1 amplification.

These results strongly indicate that amplification of
several regions on chromosomes 1 and 8 occurs simultaneously
but not sequentially in SCLC cells, and further
support that complicated intrachromosomal
rearrangements occur in the process of MYCL1 or
MYC amplification, resulting in the co-amplification
and fusion of several genes on chromosomes 1
and 8. Therefore, the PVT1 and RLF loci would
be hotspots of chromosomal breaks in the process
of gene amplification in SCLC cells.

SCLCs with Expression of Multiple Fusion 
Transcripts 

Sixty fusion transcripts were detected in 23 of the
42 SCLCs (Table 3, Supporting Information Table
S4), indicating the presence of SCLCs expressing
multiple fusion transcripts. Indeed, sixteen fusion
pairs consisting of 15 genes were identified in H1963
(Supporting Information Fig. S4A). Twelve of the
15 genes, including MYCL1, were mapped to the
1p34.2 amplicon, and the remaining 3 genes were
mapped to the 20q11.21 amplicon. Seven fusions
were intrachromosomal among genes at 1p34.2 or
20q11.21, while the other nine fusions were
interchromosomal between genes at 1p34.2 and genes at
20q11.21. Therefore, in H1963, complicated
chromosomal rearrangements were likely to have
occurred in the process of MYCL1 amplification.

Seven fusion pairs consisting of 14 genes were
identified in SM09-016T (Supporting Information
Fig. S4B). Seven of the 14 genes were mapped to
chromosome 3 and the other seven to chromosome 11.
The 5' and 3' partner genes for four of the seven
fusions were mapped to the same chromosomes,
while those for the remaining three fusions were
mapped to different chromosomes. Therefore,
these genes were fused by either intrachromosomal
or interchromosomal rearrangements. Interestingly,
no fused genes were amplified in SM09-016T,
indicating that complicated chromosomal
rearrangements had occurred without gene amplification.
Two to five fusion transcripts were detected in
eight other SCLCs. Among 24 fusions detected in
these SCLCs, 18 were intrachromosomal
and the remaining six were interchromosomal.
Eight of the 5' partner genes and four of the 3' partner genes
were mapped to the amplified regions. Therefore,
intrachromosomal rearrangements seemed to occur
preferentially in SCLC cells irrespective of the
process of gene amplification.

KIAA1432-JAK2 Fusion Detected in an SCLC with
9p24.1 Amplification

Interestingly, a KIAA1432-JAK2 fusion transcript
was detected in SM09-010T with amplification of
the KIAA1432 gene at 9p24.1 (Table 3; Fig. 3A).
Furthermore, three (4.6%) of 65 SCLCs analyzed
by RT-PCR were shown to express KIAA1432-JAK2
fusion transcripts. However, only the one
expressed in SM09-010T was predicted to produce
a fusion protein, although the tyrosine kinase
domain was disrupted by the fusion (Fig. 3B).

Figure 3. A KIAA1432-JAK2 fusion detected in the SM09-010T tumor with 9p24.1 amplification.
(A) Copy number plots at 9p24.1 by 250K SNP array. (B) Schematic presentation of wild-type
KIAA1432 and JAK2 proteins, and the KIAA1432-JAK2 fusion protein. Electropherogram for
Sanger sequencing of cDNA for the KIAA1432-JAK2 fusion transcript is shown below.

DISCUSSION 

The purpose of this study was to identify genes 
activated by amplification and/or fusion in SCLC. 
By a copy number analysis, 34 genes were identi- 
fied as being frequently amplified in SCLCs. In 
concordance with previous studies, three MYC 
family genes were frequently amplified in SCLCs 
(Wistuba et al., 2001; Kim et al., 2006; Voortman 
et al., 2010; Larsen and Minna, 2011; Sos et al., 
2012). Recently, several MYC inhibitors, including 
Omomyc, BET bromodomain inhibitor and Aurora 
kinase inhibitor, have been reported (Soucek 
et al., 2008; Delmore et al., 2011; Sos et al., 2012); 
therefore, the MYC family gene products could be 
druggable targets in SCLC cells with their 
activation. 



Co-amplification of PVT1 and MIR1204 with 
MYC has been reported in several types of cancers, 
and their oncogenic roles have been also suggested 
(Guan et al., 2007; Huppi et al., 2008; Haverty 
et al., 2009; Schiffman et al., 2010; Sircoulomb 
et al., 2010). However, in this study, only the 5' 
portion of the PVT1 gene including exon 1 was 
commonly amplified, and amplified PVT1 genes 
were often fused with other genes. Since the 3' 
partner genes in five PVT1 fusions were different 
from each other, biological significance of PVT1 
amplification and/or fusion is unclear at present. 
The PVT1 locus could be a hotspot of chromo- 
somal breaks in the process of MYC amplification 
in SCLCs. Since MIR1204 was always co-amplified
with MYC, involvement of MIR1204 in the
development of SCLC cannot be excluded. Recently,
PVT1-MYC fusions were detected in >60% of
medulloblastomas with MYC amplification, and
these fusions also involved PVT1 exon 1 and
MIR1204 (Northcott et al., 2012). These results
further support that the PVT1 locus is a hotspot of
chromosomal breaks in the process of MYC
amplification, although no PVT1-MYC fusions were
detected in SCLCs.

In addition to MYCL1 and MYC, several regions
on either the same or different chromosomes 
were co-amplified in SCLCs. Furthermore, sev- 
eral genes, especially RLF and PVT1 in the 
MYCL1 and MYC amplicons, respectively, were 
fused with genes in different amplified regions. 
These results strongly indicate that amplification 
of several regions on chromosomes 1 and 8 
occurred simultaneously but not independently/ 
sequentially in these SCLCs. Therefore, several 
genes co-amplified with MYCL1 and MYC were 
likely to be rearranged and amplified together 
with MYCL1 and MYC by a massive genomic rear- 
rangement acquired in a single catastrophic 
event. Recently, a new mechanism for genetic 
instability in cancer cells, chromothripsis, was 
proposed by Stephens et al. (2011). In chromo- 
thripsis, tens to hundreds of chromosomal rear- 
rangements involving localized genomic regions 
can be acquired in a one-off cellular catastrophe. 
Indeed, CHD7 at 8q12 was shown to be rearranged
in three SCLC cell lines (Campbell et al.,
2008; Pleasance et al., 2010). In this study, ampli- 
fied genes in SCLC cells often showed fusions 
with genes in the same amplicons, different 
amplicons on the same chromosome, or different 
amplicons on different chromosomes. These 
results strongly indicate that amplification of 
MYCL1 and MYC often occurs through chromo- 
thripsis in SCLCs, although the presence of tens 
to hundreds of chromosomal rearrangements in 
particular genomic regions should be confirmed 
by whole-genome sequencing. Therefore, the target 
genes of amplification on these chromosomes 
are likely to be MYCL1 and MYC, respectively, even 
though multiple regions on chromosomes 1 and 8 
were commonly amplified in SCLCs. 

Two regions on chromosome 9p were also com- 
monly amplified in SCLCs. Notably, amplification 
at 9p24.1 tended to occur in SCLCs without 
amplification of MYC family genes. In contrast, the 
9p23 region including NFIB was co-amplified 
with MYC in H446. Previously, Nfib in a mouse 
SCLC model was shown to be frequently co-amplified 
with Mycl1 (Dooley et al., 2011), consistent 
with the present results. However, 9p24.1 and 
9p23 were independently amplified in most 
SCLCs. Therefore, these regions were unlikely to 
be amplified by chromothripsis, and the 9p23 and 
9p24.1 regions may each contain an independent target 
gene. Expression analyses revealed 
that NFIB at 9p23 and KIAA1432 at 9p24.1 were 
overexpressed as a result of gene amplification in SCLCs 
and are thus strong candidates for genes activated by 
amplification in SCLCs. Recently, KIAA1432 was 
reported to be amplified and overexpressed in 
breast cancer as well, and is thus a target gene of 
amplification not only in SCLC but also in breast cancer 
(Wu et al., 2012). KIAA1432 encodes CIP150, a partner protein 
of connexin 43 (Cx43) (Akiyama 
et al., 2005). Cx43, a structural protein in the gap 
junction, has been reported as being a tumor sup- 
pressor inactivated in several cancers (Li et al., 
2008; Naus and Laird, 2010; Plante et al., 2011). 
Therefore, it is possible that CIP150 encoded by 
KIAA1432 is involved in the regulation of Cx43 
activities and its overexpression may play a role in 
SCLC development. Further functional studies 
are needed to clarify the biological significance of 
KIAA1432 amplification in SCLC development. 
Interestingly, a KIAA1432-JAK2 fusion was identi- 
fied in a case with KIAA1432 amplification. Various 
fusions with JAK2 have been reported in hemato- 
logical malignancies (Van Roosbroeck et al., 2011). 
These fusions contained the whole tyrosine kinase 
domain and led to constitutive phosphorylation 
of the kinase (Lacronique et al., 1997; Griesinger 
et al., 2005; Poitras et al., 2008; Nebral et al., 2009; 
Van Roosbroeck et al., 2011). However, the kinase 
domain was disrupted by the KIAA1432-JAK2 
fusion identified in this study. Therefore, it is 
unlikely that JAK2 is activated by fusion with 
KIAA1432 in SCLCs. There might be hotspots of 
chromosomal breakpoints in the JAK2 and 
KIAA1432 loci in the process of 9p24.1 amplifica- 
tion in SCLCs. 

During the preparation of this manuscript, the 
results of comprehensive and integrative genome 
analyses on SCLCs were reported by two groups 
(Peifer et al., 2012; Rudin et al., 2012). Frequent 
amplification of the SOX2 (copy number >4) and 
FGFR1 (copy number >3.5) genes at 3q26.3-q27 
and 8p12, respectively, was shown in their 
articles. When we used the same criteria (copy 
number >4), the SOX2 and FGFR1 genes were 
amplified in 21 (36.0%) and 7 (12.1%), respec- 
tively, of 58 SCLCs subjected to 250K SNP array 
analysis. However, in this study, using the criterion 
of copy number >6 for detection of focally 
amplified genes, neither SOX2 nor FGFR1 was 
identified as an amplified gene in SCLCs, 
because the extent of amplification of the SOX2 and 
FGFR1 genes was not as high as that of the MYC 
family genes and the 9p genes. Therefore, under our 
criterion, genes activated by a low degree of 
amplification (3-5 copies) were overlooked. However, 
genes with a high degree of amplification (copy 
number >6) were successfully and efficiently 
identified in the SCLC genomes. A recurrent 
RLF-MYCL1 fusion was reported in one article 
(Rudin et al., 2012). The RLF-MYCL1 fusion was 
also identified in H1963 in this study, but both 
RLF and MYCL1 were fused with several other 
genes in this cell line, indicating the occurrence of 
chromothripsis in the production of those fusions 
in SCLCs. 
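
The effect of the copy number criterion on which genes are called amplified can be illustrated with a minimal sketch. The copy number values and the function name `amplified_fraction` below are hypothetical and chosen only for illustration; they are not data or methods from this study.

```python
from typing import Dict, List

# Hypothetical per-sample copy number estimates (NOT data from this study).
copy_numbers: Dict[str, List[float]] = {
    "SOX2": [2.1, 4.5, 5.0, 3.0, 4.2],   # modest gains only
    "MYC": [2.0, 2.2, 12.0, 2.1, 8.5],   # a few high-level amplifications
}


def amplified_fraction(values: List[float], threshold: float) -> float:
    """Fraction of samples whose copy number strictly exceeds the threshold."""
    if not values:
        return 0.0
    return sum(v > threshold for v in values) / len(values)


for gene, values in copy_numbers.items():
    low = amplified_fraction(values, 4.0)   # lower cutoff (copy number >4)
    high = amplified_fraction(values, 6.0)  # stricter cutoff (copy number >6)
    print(f"{gene}: >4 copies in {low:.0%} of samples, >6 copies in {high:.0%}")
```

With these illustrative values, a gene showing only modest gains is called amplified under the >4 criterion but not under the >6 criterion, whereas a gene with high-level amplification is called under both.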

We should also point out that the MYC family gene 
amplification statuses defined in this study for 
some cell lines were not the same as those 
reported previously. To present these inconsistencies 
clearly, we prepared Supporting 
Information Table S5, in which the 
statuses for MYC family amplification defined in 
this study were summarized together with those in 
three other studies (Kim et al., 2006; Voortman 
et al., 2010; Sos et al., 2012) for each of all the 25 
cell lines analyzed in this study. In 18 of the 25 
cell lines, statuses of MYC family gene amplifica- 
tion were also defined in 1-3 of the other studies. 
In 14 of the 18 cell lines, statuses of MYC family 
gene amplification were consistent among studies. 
However, in the H69 cell line, MYCN amplifica- 
tion was detected in three of the four studies, and 
in the remaining three cell lines, H128, H187, and 
H2107, either MYC or MYCL1 amplification was 
detected only in one of three or four studies. 
These inconsistencies are probably due to differences 
in the criteria for gene amplification among 
the four studies, and possibly also to differences 
in the methods and platforms 
used for assessing the copy number of each gene 
among them. 

In this study, we did not address somatic mutations 
that could be detected by whole-transcriptome 
sequencing, because whole-transcriptome sequencing 
can detect somatic mutations only in genes that are 
highly expressed in the cells. In our preliminary 
results, various types of mutations detected by 
genome sequencing were not detected by 
whole-transcriptome sequencing, possibly because of 
low or absent expression. In addition, because mRNA 
expression levels differ among genes, the total read 
counts of transcripts available for sequencing 
varied among genes in each sample. 
Therefore, it was difficult to obtain conclusive 
results on the presence of mutations by 
whole-transcriptome sequencing alone. To obtain more 
convincing results, the presence of mutations would 
have to be confirmed by several types of genome 
sequencing, such as direct sequencing and 
whole-exome/genome sequencing. Accordingly, in 
this study, we did not present the data for possible 
somatic mutations detected by whole-transcriptome 
sequencing. In contrast, the presence 
of amplified or fused genes could be easily confirmed 
by PCR analysis. Therefore, in this 
study, we attempted to compile the list of genes 
that were activated by amplification and/or 
fusion in SCLC cells. Further studies are now in 
progress. 